dataset: pin validation shard #545

Closed
KartikVashishta wants to merge 4 commits into karpathy:master from KartikVashishta:fix/val-shard

Conversation

@KartikVashishta
Contributor

Addresses #541. The validation shard is now pinned, so val loss/bpb is stable regardless of how many shards were downloaded.

Collaborator

@svlandeg svlandeg left a comment

Thanks for the PR! I had a few comments, inline:

@svlandeg svlandeg added the bug Something isn't working label Feb 19, 2026
@svlandeg svlandeg added the waiting Waiting for user feedback/action label Feb 19, 2026
@svlandeg svlandeg linked an issue Feb 19, 2026 that may be closed by this pull request
@svlandeg svlandeg self-assigned this Feb 20, 2026
@svlandeg svlandeg removed the waiting Waiting for user feedback/action label Feb 20, 2026
Collaborator

@svlandeg svlandeg left a comment

I looked at the code diff and ran some quick tests on this PR's branch, and it all looks good apart from a few nitpicking comments 😉

It makes sense to me to designate shard_01822.parquet as the standard validation shard. The docstrings and UX messages have been updated accordingly.
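
For readers skimming the thread, the idea of pinning the validation shard can be sketched roughly as below. This is only an illustration, not the code from this PR: `split_shards`, `VAL_SHARD_NAME`, and the `shard_*.parquet` directory layout are hypothetical names assumed for the example; only the filename shard_01822.parquet comes from the discussion above.

```python
from pathlib import Path

# Filename taken from the review comment; the constant name is hypothetical.
VAL_SHARD_NAME = "shard_01822.parquet"


def split_shards(data_dir):
    """Return (train_paths, val_path) with the validation shard chosen by name.

    Hypothetical helper: because the validation shard is selected by a fixed
    filename rather than by its position in the listing, downloading more or
    fewer training shards does not change which data is used for validation.
    """
    data_dir = Path(data_dir)
    val_path = data_dir / VAL_SHARD_NAME
    if not val_path.is_file():
        raise FileNotFoundError(f"validation shard {VAL_SHARD_NAME} not found in {data_dir}")
    # Every other shard, in sorted order, is treated as training data.
    train_paths = sorted(
        p for p in data_dir.glob("shard_*.parquet") if p.name != VAL_SHARD_NAME
    )
    return train_paths, val_path
```

With a helper along these lines, a run that downloaded 8 shards and a run that downloaded 240 would evaluate on the same file, which is what makes the reported val loss/bpb comparable between them.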

@svlandeg svlandeg removed their assignment Feb 23, 2026
@KartikVashishta
Contributor Author

Thanks for the review, @svlandeg!

Collaborator

@svlandeg svlandeg left a comment

This has now been fixed by a recent edit on master, so I'll go ahead and close this one 🙏

Labels

bug Something isn't working

Development

Successfully merging this pull request may close these issues.

make validation loss reproducible

2 participants